An Optimal Hierarchically Clustering Number Determining Method

Authors

  • Hongfang ZHOU
  • Xuehan ZHAO
  • Hongyan LI
  • Peng WANG
  • Zhentao QIN
Abstract

In hierarchical clustering algorithms, determining the optimal number of clusters in a dataset is a basic and difficult problem, because outliers and noise points distort the result. We therefore propose a method, based on the traditional noise-removal approach, that removes this interfering data in two stages within the hierarchical clustering algorithm, after which the optimal number of clusters can be obtained. Theoretical analysis and experimental results verify the effectiveness and good performance of the algorithm.
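
The abstract gives no algorithmic detail, so the following is only an illustrative sketch of the general idea (filter noise first, then read the cluster count off the merge hierarchy), not the authors' two-stage method. Everything in it is an assumption made for illustration: the k-nearest-neighbour outlier filter, the `factor` threshold, single linkage, the largest-gap rule, and the toy data.

```python
import math


def knn_distance(points, i, k):
    """Distance from points[i] to its k-th nearest neighbour."""
    dists = sorted(math.dist(points[i], q)
                   for j, q in enumerate(points) if j != i)
    return dists[k - 1]


def drop_outliers(points, k=2, factor=3.0):
    """Keep points whose k-NN distance is at most factor * the median k-NN distance.

    Hypothetical single-pass filter; the paper's two-stage removal is not
    specified in the abstract.
    """
    scores = [knn_distance(points, i, k) for i in range(len(points))]
    median = sorted(scores)[len(scores) // 2]
    return [p for p, s in zip(points, scores) if s <= factor * median]


def merge_heights(points):
    """Naive single-linkage agglomeration; returns merge distances in order."""
    clusters = [[p] for p in points]
    heights = []
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = min(math.dist(p, q)
                        for p in clusters[a] for q in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        clusters[a].extend(clusters[b])
        del clusters[b]
        heights.append(d)
    return heights


def estimate_cluster_count(points):
    """Choose the cluster count just before the largest jump in merge height."""
    heights = merge_heights(points)
    gaps = [heights[i] - heights[i - 1] for i in range(1, len(heights))]
    # Merge i (0-based) starts from len(points) - i clusters; stop before it.
    i = max(range(len(gaps)), key=gaps.__getitem__) + 1
    return len(points) - i


# Toy data (assumed): three tight 2x2 groups plus one distant outlier.
group = [(0, 0), (0, 1), (1, 0), (1, 1)]
points = ([(x, y) for x, y in group]
          + [(x + 10, y) for x, y in group]
          + [(x, y + 10) for x, y in group]
          + [(50, 50)])

raw_k = estimate_cluster_count(points)
clean_k = estimate_cluster_count(drop_outliers(points))
print(raw_k, clean_k)
```

On this toy data the single outlier creates the largest gap in the hierarchy, so the unfiltered estimate is 2, while the filtered estimate recovers the 3 groups. The pairwise search is deliberately naive and only suitable for small inputs.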

Similar resources

A descriptive method to evaluate the number of regimes in a switching autoregressive model

This paper proposes a descriptive method for an open problem in time series analysis: determining the number of regimes in a switching autoregressive model. We will translate this problem into a classification one and define a criterion for hierarchically clustering different model fittings. Finally, the method will be tested on simulated examples and real-life data.

A meta-heuristic clustering method to reduce energy consumption in Internet of Things

The Internet of Things (IoT) is an emerging phenomenon in the field of communication, in which smart objects communicate with each other and respond to user requests. The IoT provides an integrated framework providing interoperability across various platforms. One of the most essential and necessary components of IoT is wireless sensor networks. Sensor networks play a vital role in the lowest l...

An Expansion of X-means for Automatically Determining the Optimal Number of Clusters

We expand a non-hierarchical clustering algorithm that can determine the optimal number of clusters by using iterations of k-means and a stopping rule based on Bayesian Information Criterion (BIC). The procedure requires merging the clusters that a k-means iteration has made to avoid unsuitable division caused by the division order. By using this additional merging operation, the case of adequate...

Determining the Best K for Clustering Transactional Datasets: A Coverage Density-based Approach

The problem of determining the optimal number of clusters is important but mysterious in cluster analysis. In this paper, we propose a novel method to find a set of candidate optimal number Ks of clusters in transactional datasets. Concretely, we propose Transactional-cluster-modes Dissimilarity based on the concept of coverage density as an intuitive transactional inter-cluster dissimilarity m...

Publication date: 2012